NSF PAR Search | NSF Public Access Repository

Visually Consistent Hierarchical Image Classification

Park, Seulki; Zhang, Youren; Yu, Stella X; Beery, Sara; Huang, Jonathan (April 2025, International Conference on Learning Representations)

Full Text Available
Align and Distill: Unifying and Improving Domain Adaptive Object Detection

Kay, Justin; Haucke, Timm; Stathatos, Suzanne; Deng, Siqi; Young, Erik; Perona, Pietro; Beery, Sara; Van Horn, Grant (March 2025, Transactions on Machine Learning Research)

Object detectors often perform poorly on data that differs from their training set. Domain adaptive object detection (DAOD) methods have recently demonstrated strong results on addressing this challenge. Unfortunately, we identify systemic benchmarking pitfalls that call past results into question and hamper further progress: (a) Overestimation of performance due to underpowered baselines, (b) Inconsistent implementation practices preventing transparent comparisons of methods, and (c) Lack of generality due to outdated backbones and lack of diversity in benchmarks. We address these problems by introducing: (1) A unified benchmarking and implementation framework, Align and Distill (ALDI), enabling comparison of DAOD methods and supporting future development, (2) A fair and modern training and evaluation protocol for DAOD that addresses benchmarking pitfalls, (3) A new DAOD benchmark dataset, CFC-DAOD, increasing the diversity of available DAOD benchmarks, and (4) A new method, ALDI++, that achieves state-of-the-art results by a large margin. ALDI++ outperforms the previous state-of-the-art by +3.5 AP50 on Cityscapes → Foggy Cityscapes, +5.7 AP50 on Sim10k → Cityscapes (where ours is the only method to outperform a fair baseline), and +0.6 AP50 on CFC-DAOD. ALDI and ALDI++ are architecture-agnostic, setting a new state-of-the-art for YOLO and DETR-based DAOD as well without additional hyperparameter tuning. Our framework, dataset, and method offer a critical reset for DAOD and provide a strong foundation for future research.
Full Text Available
Harnessing artificial intelligence to fill global shortfalls in biodiversity knowledge

https://doi.org/10.1038/s44358-025-00022-3

Pollock, Laura J; Kitzes, Justin; Beery, Sara; Gaynor, Kaitlyn M; Jarzyna, Marta A; Mac Aodha, Oisin; Meyer, Bernd; Rolnick, David; Taylor, Graham W; Tuia, Devis; et al (March 2025, Nature Reviews Biodiversity)

Large, well described gaps exist in both what we know and what we need to know to address the biodiversity crisis. Artificial intelligence (AI) offers new potential for filling these knowledge gaps, but where the biggest and most influential gains could be made remains unclear. To date, biodiversity-related uses of AI have largely focused on tracking and monitoring of wildlife populations. Rapid progress is being made in the use of AI to build phylogenetic trees and species distribution models. However, AI also has considerable unrealized potential in the re-evaluation of important ecological questions, especially those that require the integration of disparate and inherently complex data types, such as images, video, text, audio and DNA. This Review describes the current and potential future use of AI to address seven clearly defined shortfalls in biodiversity knowledge. Recommended steps for AI-based improvements include the re-use of existing image data and the development of novel paradigms, including the collaborative generation of new testable hypotheses. The resulting expansion of biodiversity knowledge could lead to science spanning from genes to ecosystems — advances that might represent our best hope for meeting the rapidly approaching 2030 targets of the Global Biodiversity Framework.
Full Text Available
Monitoring Social Insect Activity with Minimal Human Supervision

https://doi.org/10.1109/CVPRW63382.2024.00131

Sharma, Tarun; Wagner, Julian M; Beery, Sara; Dickson, William B; Dickinson, Michael H; Parker, Joseph (June 2024, IEEE)

Full Text Available
A landmark environmental law looks ahead: The Prospect of Using Gene Editing for Deliberate Extinction

https://doi.org/10.1126/science.adn3245

Fischman, Robert L; Ruhl, J B; Forester, Brenna R; Lama, Tanya M; Kardos, Marty; Rojas, Grethel Aguilar; Robinson, Nicholas A; Shirey, Patrick D; Lamberti, Gary A; Ando, Amy W; et al (December 2023, Science)

In late December 1973, the United States enacted what some would come to call “the pitbull of environmental laws.” In the 50 years since, the formidable regulatory teeth of the Endangered Species Act (ESA) have been credited with considerable successes, obliging agencies to draw upon the best available science to protect species and habitats. Yet human pressures continue to push the planet toward extinctions on a massive scale. With that prospect looming, and with scientific understanding ever changing, Science invited experts to discuss how the ESA has evolved and what its future might hold. (Brad Wible)
Full Text Available
A landmark environmental law looks ahead: Updating practices for the genomic era

Fischman, Robert L; Ruhl, J B; Forester, Brenna R; Lama, Tanya M; Kardos, Marty; Rojas, Grethel Aguilar; Robinson, Nicholas A; Shirey, Patrick D; Lamberti, Gary A; Ando, Amy W; et al (December 2023, Science)

In late December 1973, the United States enacted what some would come to call “the pitbull of environmental laws.” In the 50 years since, the formidable regulatory teeth of the Endangered Species Act (ESA) have been credited with considerable successes, obliging agencies to draw upon the best available science to protect species and habitats. Yet human pressures continue to push the planet toward extinctions on a massive scale. With that prospect looming, and with scientific understanding ever changing, Science invited experts to discuss how the ESA has evolved and what its future might hold.
Full Text Available
Perspectives in machine learning for wildlife conservation

https://doi.org/10.1038/s41467-022-27980-y

Tuia, Devis; Kellenberger, Benjamin; Beery, Sara; Costelloe, Blair R.; Zuffi, Silvia; Risse, Benjamin; Mathis, Alexander; Mathis, Mackenzie W.; van Langevelde, Frank; Burghardt, Tilo; et al (December 2022, Nature Communications)

Inexpensive and accessible sensors are accelerating data acquisition in animal ecology. These technologies hold great potential for large-scale ecological understanding, but are limited by current processing approaches which inefficiently distill data into relevant information. We argue that animal ecologists can capitalize on large datasets generated by modern sensors by combining machine learning approaches with domain knowledge. Incorporating machine learning into ecological workflows could improve inputs for ecological models and lead to integrated hybrid modeling tools. This approach will require close interdisciplinary collaboration to ensure the quality of novel approaches and train a new generation of data scientists in ecology and conservation.
Full Text Available
Extending the WILDS Benchmark for Unsupervised Adaptation

Sagawa, Shiori; Koh, Pang Wei; Lee, Tony; Gao, Irena; Xie, Sang Michael; Shen, Kendrick; Kumar, Ananya; Hu, Weihua; Yasunaga, Michihiro; Marklund, H.; et al (January 2022, International Conference on Learning Representations)

Machine learning systems deployed in the wild are often trained on a source distribution but deployed on a different target distribution. Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well. However, existing distribution shift benchmarks with unlabeled data do not reflect the breadth of scenarios that arise in real-world applications. In this work, we present the WILDS 2.0 update, which extends 8 of the 10 datasets in the WILDS benchmark of distribution shifts to include curated unlabeled data that would be realistically obtainable in deployment. These datasets span a wide range of applications (from histology to wildlife conservation), tasks (classification, regression, and detection), and modalities (photos, satellite images, microscope slides, text, molecular graphs). The update maintains consistency with the original WILDS benchmark by using identical labeled training, validation, and test sets, as well as the evaluation metrics. On these datasets, we systematically benchmark state-of-the-art methods that leverage unlabeled data, including domain-invariant, self-training, and self-supervised methods, and show that their success on WILDS is limited. To facilitate method development and evaluation, we provide an open-source package that automates data loading and contains all of the model architectures and methods used in this paper. Code and leaderboards are available at this https URL.
Full Text Available
Extending the WILDS Benchmark for Unsupervised Adaptation

Sagawa, Shiori; Koh, Pang Wei; Lee, Tony; Gao, Irena; Xie, Sang Michael; Shen, Kendrick; Kumar, Ananya; Hu, Weihua; Yasunaga, Michihiro; Marklund, Henrik; et al (January 2022, International Conference on Learning Representations (ICLR))

Machine learning systems deployed in the wild are often trained on a source distribution but deployed on a different target distribution. Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well. However, existing distribution shift benchmarks with unlabeled data do not reflect the breadth of scenarios that arise in real-world applications. In this work, we present the WILDS 2.0 update, which extends 8 of the 10 datasets in the WILDS benchmark of distribution shifts to include curated unlabeled data that would be realistically obtainable in deployment. These datasets span a wide range of applications (from histology to wildlife conservation), tasks (classification, regression, and detection), and modalities (photos, satellite images, microscope slides, text, molecular graphs). The update maintains consistency with the original WILDS benchmark by using identical labeled training, validation, and test sets, as well as identical evaluation metrics. We systematically benchmark state-of-the-art methods that use unlabeled data, including domain-invariant, self-training, and self-supervised methods, and show that their success on WILDS is limited. To facilitate method development, we provide an open-source package that automates data loading and contains the model architectures and methods used in this paper.
Full Text Available
WILDS: A Benchmark of in-the-Wild Distribution Shifts

Koh, Pang Wei; Sagawa, Shiori; Marklund, Henrik; Xie, Sang Michael; Zhang, Marvin; Balsubramani, Akshay; Hu, Weihua; Yasunaga, Michihiro; Phillips, Richard Lanas; Gao, Irena; et al (January 2021, Proceedings of Machine Learning Research)
Distribution shifts, where the training distribution differs from the test distribution, can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild. Despite their ubiquity in real-world deployments, these distribution shifts are under-represented in the datasets widely used in the ML community today. To address this gap, we present WILDS, a curated benchmark of 10 datasets reflecting a diverse range of distribution shifts that naturally arise in real-world applications, such as shifts across hospitals for tumor identification, across camera traps for wildlife monitoring, and across time and location in satellite imaging and poverty mapping. On each dataset, we show that standard training yields substantially lower out-of-distribution than in-distribution performance. This gap remains even with models trained by existing methods for tackling distribution shifts, underscoring the need for new methods for training models that are more robust to the types of distribution shifts that arise in practice. To facilitate method development, we provide an open-source package that automates dataset loading, contains default model architectures and hyperparameters, and standardizes evaluations. The full paper, code, and leaderboards are available at https://wilds.stanford.edu.
Full Text Available